Strict Nash equilibrium
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (5 more...)
Accelerated regularized learning in finite N-person games
Lotidis, Kyriakos, Giannou, Angeliki, Mertikopoulos, Panayotis, Bambos, Nicholas
Motivated by the success of Nesterov's accelerated gradient algorithm for convex minimization problems, we examine whether it is possible to achieve similar performance gains in the context of online learning in games. To that end, we introduce a family of accelerated learning methods, which we call "follow the accelerated leader" (FTXL), and which incorporates the use of momentum within the general framework of regularized learning - and, in particular, the exponential / multiplicative weights algorithm and its variants. Drawing inspiration and techniques from the continuous-time analysis of Nesterov's algorithm, we show that FTXL converges locally to strict Nash equilibria at a superlinear rate, achieving in this way an exponential speed-up over vanilla regularized learning methods (which, by comparison, converge to strict equilibria at a geometric, linear rate). Importantly, FTXL maintains its superlinear convergence rate in a broad range of feedback structures, from deterministic, full information models to stochastic, realization-based ones, and even when run with bandit, payoff-based information, where players are only able to observe their individual realized payoffs.
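The vanilla baseline that FTXL accelerates can be sketched as follows: a plain exponential / multiplicative weights loop, run independently by two players in a toy 2x2 game with a strict Nash equilibrium. The game, step size, and horizon are illustrative assumptions, not the paper's; the sketch only shows the geometric (linear-rate) convergence that FTXL is claimed to speed up.

```python
import math

# Illustrative 2x2 symmetric game (a Prisoner's Dilemma): action 1
# ("defect") strictly dominates, so (1, 1) is a strict Nash equilibrium.
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]  # PAYOFF[my_action][their_action]

def ew_step(x, utilities, eta):
    """One exponential / multiplicative weights update."""
    w = [xi * math.exp(eta * u) for xi, u in zip(x, utilities)]
    s = sum(w)
    return [wi / s for wi in w]

def run(T=2000, eta=0.1):
    x = [0.5, 0.5]  # player 1's mixed strategy
    y = [0.5, 0.5]  # player 2's mixed strategy
    for _ in range(T):
        # Expected payoff of each pure action against the opponent's mix.
        ux = [sum(PAYOFF[a][b] * y[b] for b in range(2)) for a in range(2)]
        uy = [sum(PAYOFF[a][b] * x[b] for b in range(2)) for a in range(2)]
        x, y = ew_step(x, ux, eta), ew_step(y, uy, eta)
    return x, y
```

Because the dominated action's weight shrinks by a bounded multiplicative factor every round, the mass on it decays like exp(-ct): the geometric, linear rate the abstract uses as its point of comparison.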
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (6 more...)
Selection pressure/Noise driven cooperative behaviour in the thermodynamic limit of repeated games
Consider the scenario where an infinite number of players (i.e., the \textit{thermodynamic} limit) find themselves in a Prisoner's dilemma type situation, in a \textit{repeated} setting. Is it reasonable to anticipate that, in these circumstances, cooperation will emerge? This paper addresses this question by examining the emergence of cooperative behaviour, in the presence of \textit{noise} (or, under \textit{selection pressure}), in repeated Prisoner's Dilemma games, involving strategies such as \textit{Tit-for-Tat}, \textit{Always Defect}, \textit{GRIM}, \textit{Win-Stay, Lose-Shift}, and others. To analyze these games, we employ a numerical Agent-Based Model (ABM) and compare it with the analytical Nash Equilibrium Mapping (NEM) technique, both based on the \textit{1D}-Ising chain. We use \textit{game magnetization} as an indicator of cooperative behaviour. A significant finding is that for some repeated games, a discontinuity in the game magnetization indicates a \textit{first}-order \textit{selection pressure/noise}-driven phase transition. The phase transition is particular to strategies where players do not severely punish a single defection. We also observe that in these particular cases, the phase transition critically depends on the number of \textit{rounds} the game is played in the thermodynamic limit. For all five games, we find that both ABM and NEM, in conjunction with game magnetization, provide crucial inputs on how cooperative behaviour can emerge in an infinite-player repeated Prisoner's dilemma game.
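As a rough illustration of the agent-based side (not the paper's ABM or NEM implementation), the sketch below places one-shot Prisoner's Dilemma players on a 1D ring and lets each site imitate a random neighbour through a logistic (Fermi) rule, with beta standing in for the inverse noise / selection-pressure parameter; the payoffs and parameters are assumptions for illustration.

```python
import math
import random

def game_magnetization(N=200, beta=2.0, sweeps=100, seed=0):
    """Toy 1D agent-based sketch: sites on a ring play a one-shot
    Prisoner's Dilemma with both neighbours and copy a random neighbour
    with logistic probability in the payoff difference.
    Returns the game magnetization (cooperators - defectors) / N."""
    rng = random.Random(seed)
    pay = [[3.0, 0.0],  # pay[my][their]; 0 = cooperate, 1 = defect
           [5.0, 1.0]]
    s = [rng.randint(0, 1) for _ in range(N)]

    def payoff(i):
        return pay[s[i]][s[(i - 1) % N]] + pay[s[i]][s[(i + 1) % N]]

    for _ in range(sweeps * N):
        i = rng.randrange(N)
        j = (i + rng.choice((-1, 1))) % N
        # Fermi update: imitate neighbour j with probability increasing
        # in the payoff gap; beta controls how noisy imitation is.
        if rng.random() < 1.0 / (1.0 + math.exp(-beta * (payoff(j) - payoff(i)))):
            s[i] = s[j]

    defectors = sum(s)
    return (N - 2 * defectors) / N
```

In the one-shot game defection dominates, so the magnetization settles toward -1; the repeated-game strategies the paper studies (Tit-for-Tat, GRIM, Win-Stay Lose-Shift, etc.) are what make positive, cooperative magnetization possible.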
- North America > United States > New York (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
Games played by Exponential Weights Algorithms
d'Andrea, Maurizio, Gensbittel, Fabien, Renault, Jérôme
Many machine learning algorithms used for prediction or decision-making are designed to optimize the behavior of a single agent facing an unknown environment. However, with the increasing use of these algorithms in various fields and the complexity of the problems at hand, interactions between these algorithms, each designed for an agent unaware of the other players, have become common. This raises a natural question: where will these interactions lead? Our paper contributes to the large literature on learning algorithms in games. Precisely, we analyze the day-to-day behavior of the exponential weights (EW) algorithm with constant learning rates, when applied independently by all the players in a finite game. The EW algorithm ([15, 10, 5, 3]) is one of the most popular and widely studied algorithms, with applications in various domains and contexts: computational geometry, optimization, operations research, online statistical decision-making, and machine learning (we refer to [4, 14, 2] for a general account of the subject).
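A standard test case for this kind of question is Matching Pennies, where constant-rate EW run by both players is known not to settle day to day even though the time-averaged play approximates the mixed equilibrium. The sketch below (game, step size, and starting point are illustrative assumptions, not the paper's) tracks both quantities.

```python
import math

def ew(x, utilities, eta):
    """One constant-rate exponential weights update."""
    w = [xi * math.exp(eta * u) for xi, u in zip(x, utilities)]
    s = sum(w)
    return [wi / s for wi in w]

def matching_pennies(T=20000, eta=0.05):
    """Both players run EW independently in Matching Pennies (row wins
    on a match, payoffs +/-1). Returns the time-averaged strategies and
    the last iterates."""
    x, y = [0.6, 0.4], [0.5, 0.5]  # start slightly off the equilibrium
    ax, ay = [0.0, 0.0], [0.0, 0.0]
    for _ in range(T):
        ux = [y[0] - y[1], y[1] - y[0]]  # row payoffs vs current y
        uy = [x[1] - x[0], x[0] - x[1]]  # column payoffs vs current x
        x, y = ew(x, ux, eta), ew(y, uy, eta)
        ax = [a + xi / T for a, xi in zip(ax, x)]
        ay = [a + yi / T for a, yi in zip(ay, y)]
    return ax, ay, x, y
```

The regret guarantee of EW forces the time averages toward (1/2, 1/2), while the day-to-day iterates keep circling; separating these two notions of "where the interaction leads" is exactly the kind of distinction the paper is about.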
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States (0.04)
- Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
Taming the Exponential Action Set: Sublinear Regret and Fast Convergence to Nash Equilibrium in Online Congestion Games
Dong, Jing, Wu, Jingyu, Wang, Siwei, Wang, Baoxiang, Chen, Wei
The congestion game is a powerful model that encompasses a range of engineering systems such as traffic networks and resource allocation. It describes the behavior of a group of agents who share a common set of $F$ facilities and take actions as subsets with $k$ facilities. In this work, we study the online formulation of congestion games, where agents participate in the game repeatedly and observe feedback with randomness. We propose CongestEXP, a decentralized algorithm that applies the classic exponential weights method. By maintaining weights on the facility level, the regret bound of CongestEXP avoids the exponential dependence on the size of possible facility sets, i.e., $\binom{F}{k} \approx F^k$, and scales only linearly with $F$. Specifically, we show that CongestEXP attains a regret upper bound of $O(kF\sqrt{T})$ for every individual player, where $T$ is the time horizon. On the other hand, exploiting the exponential growth of weights enables CongestEXP to achieve a fast convergence rate. If a strict Nash equilibrium exists, we show that CongestEXP can converge to the strict Nash policy almost exponentially fast in $O(F\exp(-t^{1-\alpha}))$, where $t$ is the number of iterations and $\alpha \in (1/2, 1)$.
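The key trick described in the abstract, keeping one weight per facility rather than one per k-subset, can be sketched as follows. This is not the paper's CongestEXP pseudocode: the fixed per-facility losses, full-information update, and brute-force sampler (enumeration is only viable at tiny F, and exists here purely to make the induced subset distribution visible) are simplifying assumptions.

```python
import itertools
import math
import random

def sample_action(log_w, k, rng):
    """Sample a k-subset of facilities with probability proportional to
    the product of the facility weights exp(log_w[f]). The algorithm's
    state is just the F numbers in log_w, never C(F, k) subset weights."""
    subsets = list(itertools.combinations(range(len(log_w)), k))
    scores = [math.exp(sum(log_w[f] for f in S)) for S in subsets]
    r = rng.random() * sum(scores)
    for S, sc in zip(subsets, scores):
        r -= sc
        if r <= 0.0:
            return S
    return subsets[-1]

def run(F=5, k=2, T=200, eta=0.5, seed=0):
    rng = random.Random(seed)
    log_w = [0.0] * F
    # Stand-in environment: facilities 0 and 1 are always cheap. In a
    # real congestion game the loss would depend on how many players
    # chose each facility.
    loss = [0.0, 0.0, 1.0, 1.0, 1.0]
    for _ in range(T):
        _action = sample_action(log_w, k, rng)
        for f in range(F):  # full-information facility-level update
            log_w[f] -= eta * loss[f]
    return sample_action(log_w, k, rng)
```

Because the update touches only F weights, the regret scales with F rather than with the number of possible facility sets, which is the point of the facility-level bookkeeping.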
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
- Information Technology > Game Theory (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.46)
A Signaling Game Approach to Databases Querying and Interaction
McCamish, Ben, Termehchy, Arash, Touri, Behrouz
As most database users cannot precisely express their information needs, it is challenging for database management systems to understand them. We propose a novel formal framework for representing and understanding information needs in database querying and exploration. Our framework considers querying as a collaboration between the user and the database management system to establish a mutual language for representing information needs. We formalize this collaboration as a signaling game, where each mutual language is an equilibrium of the game. A query interface is more effective if it establishes a less ambiguous mutual language faster. We discuss some equilibria, strategies, and the convergence in this game. In particular, we propose a reinforcement learning mechanism and analyze it within our framework. We prove that this adaptation mechanism for the query interface stochastically improves the effectiveness of answering queries and converges almost surely. We extend our results to the case where the user also modifies her strategy during the interaction.
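A minimal sketch of the signaling-game dynamic, under strong simplifying assumptions (a fixed user strategy mapping intent i to query i, and a plain Roth-Erev reinforcement update rather than the paper's exact mechanism): the interface keeps a weight per (query, intent) pair and reinforces an interpretation whenever it was rewarded.

```python
import random

def train(n_intents=3, rounds=3000, seed=1):
    """Toy reinforcement sketch of the querying game. Returns the
    interface's interpretation accuracy over the first and last 10%
    of rounds."""
    rng = random.Random(seed)
    w = [[1.0] * n_intents for _ in range(n_intents)]  # w[query][intent]

    def interpret(q):
        # Sample an intent with probability proportional to its weight.
        r = rng.random() * sum(w[q])
        for i, wi in enumerate(w[q]):
            r -= wi
            if r <= 0.0:
                return i
        return n_intents - 1

    window = rounds // 10
    early = late = 0
    for t in range(rounds):
        intent = rng.randrange(n_intents)
        guess = interpret(intent)      # user sends query = intent index
        if guess == intent:
            w[intent][guess] += 1.0    # reinforce on success
            if t < window:
                early += 1
            if t >= rounds - window:
                late += 1
    return early / window, late / window
```

Since only successful interpretations are reinforced, the probability of the correct interpretation for each query grows monotonically, which is the intuition behind the almost-sure convergence claim.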
- North America > United States > Michigan (0.04)
- North America > United States > Oregon (0.04)
- North America > United States > Missouri (0.04)
- (3 more...)
- Information Technology > Information Management > Search (1.00)
- Information Technology > Game Theory (1.00)
- Information Technology > Databases (1.00)
- (3 more...)
Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games
Wang, Xiaofeng, Sandholm, Tuomas
Multiagent learning is a key problem in AI. In the presence of multiple Nash equilibria, even agents with non-conflicting interests may not be able to learn an optimal coordination policy. The problem is exacerbated if the agents do not know the game and independently receive noisy payoffs. So, multiagent reinforcement learning involves two interrelated problems: identifying the game and learning to play.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)